Draft genome sequence data of methicillin-resistant Staphylococcus aureus, strain 4233

Staphylococcus aureus is a conditionally pathogenic microorganism and one of the main causative agents of antibiotic-resistant nosocomial infections. In immunocompromised people, S. aureus infection can cause folliculitis, furuncles, impetigo, osteomyelitis, septic arthritis, sepsis, endocarditis, pneumonia and meningitis. In the present work, a methicillin-resistant S. aureus, strain 4233, was sequenced on the Illumina MiSeq platform, followed by bioinformatics processing and gene annotation using the SPAdes, RAST and CARD programs and databases. The submitted genome is 2,790,390 bp long and contains 2759 genes, including 82 RNA genes. Of these genes, 33 % are functionally significant and represent 25 functional groups. Fourteen genes encoding resistance factors to 14 different types of antibacterial drugs were predicted. The genome data for S. aureus strain 4233 will be of value in investigating the evolution and formation of antibiotic-resistant forms of S. aureus.



Value of the Data
• The presented data provide information on the whole-genome sequence of methicillin-resistant S. aureus strain 4233, isolated from the wastewater of a hospital in Almaty, Kazakhstan.
• Methicillin-resistant pathogenic strains of S. aureus can cause dangerous diseases that are difficult to treat. The presented data may therefore contribute to a better understanding of genome variability, adaptive capacity, resistance mechanisms and genetic differences between S. aureus strains, and to the development of more effective control and prevention strategies against this pathogen.
• The presented data can be useful to the scientific and medical community and may find application in research in microbiology, medicine, molecular biology and bacterial genomics.
• The data are publicly available in NCBI databases and can be used as a basis for evolutionary research using reverse genetics methods, for studying the pathways of resistance formation in pathogenic forms of S. aureus, and for identifying genetic determinants of pathogenicity.

Background
S. aureus is a conditionally pathogenic microorganism and one of the main causative agents of nosocomial infections. In immunocompromised people, S. aureus can cause both mild inflammatory processes and serious diseases [1, 2]. In addition, S. aureus can cause food poisoning if it enters the body with food [3, 4]. The fight against S. aureus infections is complicated by the development of multiple drug resistance, which significantly reduces the effectiveness of antibiotic therapy. According to WHO data, from 20 % to 80 % of hospital-acquired infections are caused by antibiotic-resistant strains of S. aureus [5].
Solving the problem of antibiotic resistance requires a comprehensive study of drug-resistant bacterial strains. Methicillin-resistant S. aureus (MRSA) strains responsible for hospital-acquired infections are known to be capable of genetic adaptation [6, 7]. Therefore, studying the genomes of strains of this microorganism isolated in different countries and continents will contribute to the understanding of complex host-pathogen interactions and pave the way for the treatment of people infected with MRSA today. In this study, we present a draft genome of a methicillin-resistant S. aureus strain isolated from hospital wastewater in Kazakhstan.

Data Description
This work presents a draft genome sequence of a methicillin-resistant S. aureus, strain 4233 (Table 1, Fig. 1). This strain was isolated from the wastewater of the Central Clinical Hospital JSC in Almaty. The presented genome is 2,790,390 bp long and contains 2759 genes, including 82 RNA genes.
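Summary figures of this kind (total length, number of contigs, GC content, N50) can be recomputed from any FASTA file of an assembly. The following Python snippet is a minimal sketch of such a calculation, not the workflow used in this study; the function names and the two-contig toy input are illustrative assumptions.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(records):
    """Return contig count, total length, GC fraction and N50."""
    seqs = [seq for _, seq in records]
    lengths = sorted((len(s) for s in seqs), reverse=True)
    total = sum(lengths)
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    # N50: length of the contig at which the running sum of
    # length-sorted contigs first reaches half of the total length.
    running, n50 = 0, 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "contigs": len(seqs),
        "total_bp": total,
        "gc_fraction": gc / total if total else 0.0,
        "n50": n50,
    }


# Toy input for illustration only; these are not sequences from strain 4233.
TOY_FASTA = """>contig_1 hypothetical
ATGCGC
GGAT
>contig_2 hypothetical
ATAT
"""

stats = assembly_stats(read_fasta(StringIO(TOY_FASTA)))
print(stats)
```

To apply the sketch to a real assembly, pass an open file handle to `read_fasta` in place of the `StringIO` object.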
Annotation in RAST showed that 33 % of the genes were functionally significant and represented 25 functional groups (Fig. 2) [8]. The largest groups, each with more than 100 genes, were Amino Acids and Derivatives (227 genes), Carbohydrates (170 genes) and Protein Metabolism (158 genes). The smallest groups, each with fewer than 10 genes, were Transposable Elements (Phages, Prophages, Plasmids) and Dormancy and Sporulation (8 genes); Cell Division and Cell Cycle and Potassium Metabolism (5 genes); and Secondary Metabolism and Metabolism of Aromatic Compounds (4 and 3 genes, respectively). In the remaining 16 functional groups, the number of genes varied from 10 to 98.
For phylogenetic analysis, we used the sequence of protein A (Fig. 3), which is one of the main genetic markers of S. aureus [9]. Strains for comparison were chosen to include the strains of the pathogen most frequently encountered in clinical samples and in samples isolated from animals [10, 11].
Three evolutionary branches are clearly distinguished among the dominant groups of clinical S. aureus strains of human and animal origin. The studied strain falls within the dominant group of human strains belonging to ST 45, ST 30 and ST 15.
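Distance-based tree building starts from pairwise distances between aligned sequences. As a hedged illustration only (it does not reproduce the software used to build the tree in Fig. 3), the Python sketch below computes the proportion of differing sites (p-distance) between sequences that have already been aligned. The strain labels and the toy alignment are hypothetical and are not protein A sequences from this study.

```python
from itertools import combinations


def p_distance(a, b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Return {(name_1, name_2): p-distance} for every pair of sequences."""
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m, n in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment, for illustration only.
TOY_ALIGNMENT = {
    "strain_A": "MKKKNIYSIR",
    "strain_B": "MKKKNIYSLR",
    "strain_C": "MKK-NVYSLR",
}

matrix = distance_matrix(TOY_ALIGNMENT)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A matrix of this form is the usual input to distance-based clustering methods; more realistic analyses would apply a substitution model rather than raw p-distances.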
The presence of antibiotic resistance genes in the genome of S. aureus strain 4233 was determined using the Comprehensive Antibiotic Resistance Database (CARD; card.mcmaster.ca). The predicted resistance genes are presented in Table 2. Fourteen genes encoding resistance factors to 14 different types of antibacterial drugs were found.
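A gene-by-drug-class summary of this kind can be derived from the tab-delimited report written by the CARD Resistance Gene Identifier. The Python sketch below assumes the report contains the `Best_Hit_ARO` and `Drug Class` columns, with multiple drug classes separated by semicolons; the rows shown are placeholders and are not the contents of Table 2.

```python
import csv
from collections import defaultdict
from io import StringIO


def genes_by_drug_class(handle):
    """Group predicted resistance genes by drug class.

    Expects a tab-delimited table with 'Best_Hit_ARO' and 'Drug Class'
    columns; one gene may be listed under several drug classes.
    """
    table = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        gene = row["Best_Hit_ARO"].strip()
        for drug_class in row["Drug Class"].split(";"):
            drug_class = drug_class.strip()
            if drug_class:
                table[drug_class].add(gene)
    return {name: sorted(genes) for name, genes in table.items()}


# Placeholder rows for illustration only (gene names are hypothetical).
TOY_RGI = (
    "Best_Hit_ARO\tDrug Class\n"
    "geneA\tpenam; cephalosporin\n"
    "geneB\ttetracycline antibiotic\n"
    "geneC\tpenam\n"
)

summary = genes_by_drug_class(StringIO(TOY_RGI))
for drug_class, genes in sorted(summary.items()):
    print(f"{drug_class}: {len(genes)} gene(s): {', '.join(genes)}")
```

Counting the keys of `summary` gives the number of drug classes covered, and the union of its values gives the number of distinct resistance genes.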
Methicillin-resistant strains of S. aureus are widespread throughout the world and account for up to 1 % of all isolated strains of the microorganism [12]. Their ability to form epidemically significant variants due to genome adaptability creates serious challenges for public health. Therefore, determining the genome features of newly isolated strains is important for studying the evolution of this group of microorganisms. Here we present a draft genome of S. aureus strain 4233, isolated from the sewage of the Central Clinical Hospital in Almaty, Kazakhstan. A distinguishing feature of the studied strain is the presence in its genome of 14 additional genes conferring resistance to antibiotics with different mechanisms of action. Thus, the draft genome of S. aureus strain 4233 can serve as an aid for researchers studying the spread of MRSA in different countries.

Sample collection and isolation of staphylococcus
The water sample was obtained from hospital wastewater in Almaty, Kazakhstan (43.24776402 N, 76.94454847 E). The wastewater sample was placed in a sterile 500 ml bottle and transported under refrigeration at +4 °C to the laboratory for analysis. Petri dishes containing mannitol-salt agar (CondaLab, Spain) were inoculated with 1 ml of the wastewater sample and incubated for 18-24 h. At the end of the incubation time, a single colony with typical S. aureus morphology was selected for further study. Further identification of S. aureus was based mainly on colony morphology, Gram staining, and determination of catalase and coagulase activity. To detect MRSA, the bacterial suspension of S. aureus was plated on chromogenic MRSA agar with cefoxitin MRSA supplement (CondaLab, Spain) and incubated aerobically at 35 ± 2 °C for 24-48 h. Methicillin resistance was indicated by growth of the bacterial culture as blue-colored colonies [13, 14].
The purity of the isolated strains was confirmed using the standard microbiological method. The culture was stored in 20 % glycerol stock at −80 °C.

DNA isolation, genome sequencing, assembly, and annotation
For genomic DNA isolation, S. aureus strain 4233 was grown in 5 ml TSB (CondaLab, Spain) at 37 °C for 18-20 h. Bacterial cells were then pelleted by centrifugation at 6000 rpm for 30 min. The supernatant was discarded, and the cells were resuspended in sterile phosphate-buffered saline and subjected to genomic DNA isolation using the PureLink™ Genomic DNA Mini Kit (ThermoFisher Scientific, Waltham, MA, USA) according to the supplier's instructions.
The genome of S. aureus strain 4233 was sequenced on the Illumina MiSeq platform using the MiSeq kit v3 (Illumina, Cambridge, UK), which yields 300 bp paired-end reads. The library was prepared using the Nextera XT DNA library preparation kit (Illumina, Cambridge, UK).
Adapters were trimmed from the raw reads using Trimmomatic 0.38.0 [15]. Low-quality sequences (< Q30) were removed, after which the remaining reads were on average 50-250 bp in length. De novo assembly of the quality-controlled reads was performed using SPAdes 3.12.0 [16]. The quality of the assembled genome was assessed by comparison with the reference genome using Geneious Prime version 2023. The assembled genome was annotated using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP), GeneMarkS-2+, RAST and Bakta [17, 18]. The Resistance Gene Identifier (RGI v5.2.0) was used to detect AMR genes [19]. For phylogenetic analysis, sequences were aligned with MAFFT (Multiple Alignment using Fast Fourier Transform) and a tree was then constructed using the tree-building program based on the nearest-neighbor method, as implemented in the Geneious Prime 2023 software.
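The trimming, assembly and resistance-gene detection steps described above can be expressed as three command-line calls. The exact parameters used in this study are not reported, so the Python sketch below only assembles plausible argument lists: the file names, the Trimmomatic jar path, the `NexteraPE-PE.fa` adapter file, the `SLIDINGWINDOW:4:30` and `MINLEN:50` settings, and the output layout are all assumptions chosen to approximate a Q30 filter and a 50 bp minimum read length.

```python
import shlex


def build_pipeline(r1, r2, outdir,
                   trimmomatic_jar="trimmomatic.jar",
                   adapters="NexteraPE-PE.fa"):
    """Return argument lists for trimming, assembly and AMR gene detection.

    All paths and thresholds are illustrative assumptions, not the
    settings used by the authors.
    """
    p1, u1 = f"{outdir}/R1.paired.fq.gz", f"{outdir}/R1.unpaired.fq.gz"
    p2, u2 = f"{outdir}/R2.paired.fq.gz", f"{outdir}/R2.unpaired.fq.gz"

    # Trimmomatic paired-end mode: adapter clipping, then quality trimming.
    trim = [
        "java", "-jar", trimmomatic_jar, "PE",
        r1, r2, p1, u1, p2, u2,
        f"ILLUMINACLIP:{adapters}:2:30:10",
        "SLIDINGWINDOW:4:30",  # assumed approximation of the Q30 filter
        "MINLEN:50",           # assumed minimum read length
    ]

    # SPAdes de novo assembly of the surviving read pairs.
    assemble = ["spades.py", "-1", p1, "-2", p2, "-o", f"{outdir}/spades"]

    # RGI search of the assembled contigs against CARD.
    amr = [
        "rgi", "main",
        "--input_sequence", f"{outdir}/spades/contigs.fasta",
        "--output_file", f"{outdir}/rgi",
        "--input_type", "contig",
    ]
    return [trim, assemble, amr]


commands = build_pipeline("reads_R1.fastq.gz", "reads_R2.fastq.gz", "out")
for cmd in commands:
    print(shlex.join(cmd))
```

The sketch prints the commands rather than running them; each list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the paths adjusted.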
The raw Illumina MiSeq sequencing data were submitted to the NCBI SRA database in FASTQ format under accession SRS20908588, with BioSample SAMN37344193 and BioProject PRJNA1014944. The assembled genome is available in NCBI GenBank under accession NZ_CP134071.1 [20].
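The assembled sequence can be retrieved programmatically through the NCBI E-utilities `efetch` endpoint. The Python sketch below only constructs the request URL for the accession given above; the helper name `efetch_fasta_url` is an illustrative assumption, and no network request is made.

```python
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="nuccore"):
    """Build an E-utilities efetch URL returning a record as FASTA text."""
    params = {
        "db": db,
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


url = efetch_fasta_url("NZ_CP134071.1")
print(url)
# To download the record, one option is:
#   urllib.request.urlretrieve(url, "NZ_CP134071.1.fasta")
```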

Fig. 2. Subsystem statistics for the genome of S. aureus strain 4233 obtained using RAST annotation. The subsystem categories and corresponding counts are presented in the legend.

Fig. 3. Phylogeny of dominant S. aureus strains among nosocomial infections based on the protein A sequence model.